METHODOLOGY FOR PREDICTING AND/OR DIAGNOSING DISEASE

Field of the Invention

This invention in one aspect relates to a method for predicting and/or diagnosing diseases in living animals. The invention has particular utility in diagnosing and/or predicting future risk of specific diseases in living animals and will be described in connection with such utility, although other utilities are contemplated. This invention in another aspect relates to identification of markers for diseases, or for sub-clinical conditions that in the future may develop into diseases, that are capable of distinguishing groups, and to subsets of these markers, where the utility of such markers can, for example, be determined by univariate, multivariate, or pattern recognition based analyses, and/or where the markers identified as important by the approach described also can be measured using other analytic approaches. The invention has particular applicability to predicting risk of cancer, type II diabetes, cardiovascular disease, cerebrovascular disease, and other diseases whose etiology has been established to be or hypothesized to be modified by diet or nutrition, e.g. neurodegenerative disorders such as Alzheimer's Disease, Parkinson's Disease and Huntington's Disease {1}, and will be described specifically in connection with its utility for using serum or plasma metabolites for determining breast cancer risk; however, other utilities are contemplated, other tissue or biological fluid samples (e.g., whole blood, cerebrospinal fluid, urine, and/or tissue samples) may be used instead of serum or plasma, and diseases and conditions other than breast cancer also can be addressed, as noted above. Similarly, in addition to disease, the assessment of nutritive status (over long or short term) may be utilized in accordance with yet another aspect of the present invention as a medical test under a variety of potential clinical settings, or in controlling epidemiological or pharmaceutical testing. Still other utilities, e.g. for detecting exposure to and/or sensitivity to exposure to toxins, are contemplated.

Background of the Invention

Dietary restriction (DR), i.e. underfeeding without malnutrition, has established efficacy in reducing both degenerative and neoplastic diseases. DR has been extensively explored since its first use in the 1930's because of its ability to extend both mean and maximum life span, reduce age-related morbidity, and delay or prevent certain age-associated physiological dysfunction {2, 3}. DR also alters many basic physiological processes, including metabolism, hormonal balance, and the generation of, detoxification of, and resistance to reactive oxygen species {4}. DR can be implemented in multiple ways {e.g. 5-13}. Moreover, restriction of total calories is believed to be more important than reducing intake of specific factors (e.g. fat, proteins, vitamins and minerals, etc.) {14, 15}. DR reportedly extends longevity in essentially all animals in which it has been tried, including multiple mammalian species (rat, mouse, guinea pig) {2, 5-13, 16}. Furthermore, promising data suggest that at least some of the benefits of DR, especially those regarding glucose metabolism, also occur in non-human primates {17-21} and, perhaps, in humans as well {22, 23}. Together, these observations suggest that the DR effect is robust in mammals.

DR has been shown to reduce both incidence and severity of non-neoplastic diseases. One example is the efficacy of DR against glomerulonephritis, periarteritis, and myocardial degeneration in both male and female Sprague-Dawley rats. Similar observations have been made in other strains and other diseases, such as lung disease {25}. DR is also effective at preventing some strain-specific diseases, such as auto-immune disease in NZB/NZWF1 mice {26} and in MRL/lpr mice {27}, and atherosclerotic {28} and myocardial ischemia lesions in JCR:LA-cp mice {29}.

DR also has been shown to reduce both incidence and severity of neoplastic diseases. DR-mediated reduction of neoplasia includes delayed onset of leukemia, pituitary adenomas, mammary and prostatic tumors, and hepatomas {30, 31}. Observations of the effects of DR on mammary tumors {32-36} are typical. DR acts to reduce breast cancer both by delaying onset (both by reducing initiation events and slowing promotion) and by slowing tumor progression {30}. In transgenic mice prone to mammary tumors, DR reduced tumor incidence by 67% {32}. This result reveals that DR is capable of overcoming genetic predisposition to breast cancer. Studies {33} in rats treated with a carcinogen demonstrated that high fat and high calorie diets are co-carcinogenic, and that none of the rats maintained on a 40% DR regimen developed mammary tumors, while 60% of ad libitum (AL) fed rats did. Concerns that this effect may have been partially mediated by reducing fat availability for tumor growth led to later studies {34}. Despite a higher fat content in the DR diet, results show a 75% reduction in rats with mammary tumors and in the number of tumors per animal in the tumor-bearing group. Even more impressively, DR reduced total tumor yield, average tumor size, and mean tumor burden by 93-98%. Notably, Sinha et al. demonstrated that even a 20% DR regimen reduces tumors by 65%, without effects on hormone levels or fertility {35}.

Thus, DR-mediated protection against breast cancer in laboratory models is: 1) substantial (as much as 100% reduction in cancer rates {32}) and highly replicable {30-34}; 2) robust and well-documented in a variety of animal models, including a model of genetic predisposition and a model of carcinogen exposure {31, 32}; 3) seen even with a more moderate (20%) restriction paradigm that does not affect fertility or hormone levels {34}; and 4) effective at multiple levels (initiation, promotion, progression). Thus, the present invention, in one aspect, is based on the observation that different subsets of markers that reflect DR are predictive for different diseases. For example, identifying markers in sera that reflect the DR phenotype would lead to markers that reflect risk of developing breast cancer, or other conditions affected by diet.

Consistent with its broad effects on longevity and disease, DR is a systemic phenomenon, and its effects include measurable differences in blood constituents relative to those seen in ad libitum fed (AL) animals {37}. Many previous studies have focused on measurement of hormones. For instance, studies have shown alterations in plasma corticosterone patterns and levels {38} and in some female reproductive hormones {39}; plasma cholecystokinin decreases 50% {40}; T3 but not T4 is reduced {41}; and plasma insulin drops as much as 60% in some DR models {42}. While informative, these studies have been somewhat limited by the technical complexity involved (e.g. circadian cyclicity, rapid response to stimuli). Other studies seeking more stable markers have examined markers of energy and free radical metabolism, revealing that DR decreases plasma glucose, ascorbate {e.g. 43-45} and glycohemoglobin levels {43}. Overall, the data indicate that differences in serotype distinguish AL and DR animals, and that these differences include some metabolites that are both relatively easy to assay and reflective of the beneficial effects of DR on physiology, metabolism and free radical biology (e.g. generation, sensitivity, and detoxification).

While not wishing to be bound by theory, since the AL and DR serotypes reflect robust physiological differences between these groups, it is believed that these serotypes include metabolites or metabolite profiles that cross species and predict relative risk for the development of disease in humans. Data consistent with this concept come from studies showing that the effect of DR on breast cancer is largely driven by chronic effects (termed promotion) rather than acute effects (termed initiation) {30, 31}. These data would imply that relative risk of developing breast cancer is likely reflected in general metabolism over long periods of time. Relative risk should thus be detectable in sera long before the development of overt disease. In the case of humans, who lie on a broad spectrum with respect to caloric intake, it is believed that closer fit to the AL serotype (i.e. the biological response typical of a high caloric intake) would predict higher relative risk of disease, whereas greater fit to the DR serotype (i.e. the biological response typical of a lower caloric intake) would be associated with reduced risk. While previous studies demonstrated differences between AL and DR animals, they were believed only able to look at specific, predetermined markers, making it essentially impossible to conduct a sufficiently broad and powerful search to identify markers of use for determining nutritional status or predicting health across species.

Summary of the Invention

The present invention provides a system, i.e. method and apparatus, for determining differences in concentrations of molecules, in particular small molecule metabolites, between animals whereby to create a metabolite database which may be used to reproducibly distinguish between two or more states of the health or the nutritive status of an animal. More particularly, the present invention employs analysis techniques to provide a small molecule inventory for metabolic pathway patterns of samples of ad libitum fed (AL) and dietary restricted (DR) individuals whereby to reproducibly distinguish between different dietary status of animals, between health conditions of animals, and to reproducibly predict relative risk for the development of a particular disease in animals.
The basis for this approach is that sufficient specific, reproducible, measurable changes exist in the overall biochemistry of small molecule metabolites among the different states to reproducibly distinguish the two (or more) states of interest. Different entities and/or sub-sets or combinations of markers can be used to identify different diseases or sub-clinical conditions. An HPLC-electrochemical analysis based approach in accordance with U.S. Patent No. 4,863,873, which is incorporated herein by reference, has facilitated creation of a database for the constituents of AL and DR serum.

Description of the Drawings

For a fuller understanding of the nature and objects of the present invention, reference should be had to the following detailed description taken in conjunction with the accompanying drawings wherein:

Figure 1 is a chromatographic method pump profile in accordance with the present invention;

Figures 2A-2C are array chromatograms of serum samples in accordance with the present invention;

Figure 3 is a table of biochemically identified serum metabolites in accordance with the present invention;

Figure 4 is a bar graph of biochemically differentiated serum metabolites in accordance with the present invention;

Figures 5A and 5B are dendrograms and Figures 5C and 5D are PCA patterns of biochemically differentiated serum metabolites in accordance with the present invention; and

Figure 6 is a table of biochemically identified subsets of serum metabolites in accordance with the present invention.

Detailed Description of Preferred Embodiment

Methodology for Sample Analysis and Database Creation

Sample preparation:

Blood was collected from male Fischer 344 rats by terminal exsanguination following decapitation in accordance with standard animal usage guidelines. Samples were placed on ice for 30 minutes, centrifuged, and the resulting sera snap frozen in liquid nitrogen and stored at -80°C until analysis.

Samples were precipitated and extracted in four volumes of acetonitrile (An)/0.4% acetic acid (HAc) at -20°C. One ml of centrifuged supernatant was removed, evaporated to dryness under vacuum, and reconstituted in 200 µl of Mobile Phase A as described below. This protocol conserves reactive species such as ascorbate and homogentisic acid at 1 ng/ml concentrations. 100 µl of reconstituted extract was placed in each of two autosampler vials, one immediately analyzed and the other frozen at -80°C for future confirmation analysis. Prior to injection, samples were maintained at 4°C.

Mobile Phases: Chromatographic solvents include isopropyl alcohol, methanol, acetonitrile, lithium hydroxide, glacial acetic acid, and pentane sulfonic acid (PSA). To retain stability of retention times and response potentials, a novel mobile phase pair was developed: Mobile Phase A (11 g/l of PSA at pH 3.00 with acetic acid) and Mobile Phase B (0.1 M LiAc at pH 3.00 with acetic acid in 80/10/10 methanol/An/isopropanol). PSA demonstrates an improved ability to solubilize and remove protein and peptide fragments from both HPLC (C18) columns and coulometric detectors, while the high organic modifier (Mobile Phase B) effectively removes residual lipids and polysaccharides. Sulfonic acids are, however, inherently contaminated, necessitating a cleaning protocol in which the PSA/HAc concentrated buffer (4 l of 400 g/l PSA) was electrolyzed over pyrolytic graphite at a potential of 1000 mV vs Pd(H).

Chromatographic Methods: Referring to Fig. 1, the chromatographic method involves a 120 min complex gradient from 0% Mobile Phase B to 100% Mobile Phase B, with flow rate adjusted to compensate for azeotropic viscosity effects. Gradient operation was provided by two Shimadzu LC-10AD HPLC pumps. Despite meticulous precleaning protocols, and the use of highly purified solvents and selected organic modifiers, spurious peaks occur late in the gradient. This problem was addressed by developing a device based on electrochemically activated porous carbon with sorption characteristics similar to C18. A prototype peak suppresser/gradient mixer (PS/GM) was placed in stream before the HPLC injector. The PS/GM incorporated a 2 cm length of a 1 cm diameter C18 precolumn integral with a 2.5 cm section of rod with flow interrupting grooves that serve to trap and spread mobile phase contaminants. When these were released to the grooved section during the gradient run, they were mixed to a peak width at half height of ca. 140 sec. This effectively reduced a mobile phase derived contaminant signal to a wave that was later eliminated during data reduction. The mixed gradient was delivered from the PS/GM to a PEEK lined pulse damper prior to flowing through the autosampler injector and on to the C18 columns. Sample extracts were separated on dual PTFE lined HR80 columns containing 3-µm ODS particles and measuring 80 mm x 4.6 mm I.D.

Analyte detection was accomplished with an NCA Chemical Analyzer, Model CEAS multiple electrode electrochemical detection system, available from ESA, Inc., of Chelmsford, Massachusetts. The latter includes an ESA Model 6210 analytical cell and a 16-channel coulometric electrode array incremented from -100 mV to +940 mV to detect both reducible and oxidizable compounds. The PS/GM, pulse damper, columns, and detectors are contained within a temperature controlled enclosure maintained at 35°C. System functions were controlled by the ESA, Inc. Model 4.12C CEAS software installed on a 386 microcomputer networked to remote 486-based computers where data storage, reduction and analysis were accomplished. Reports produced by the CEAS analysis software were imported to spreadsheet/database software for further statistical analysis and reports.

Data Reduction, Observation and Analysis: Chromatographic retention times, monitored by pure standards and identified sample compounds, do not vary by more than 1%. The absolute qualitative channel ratio responses do not vary by more than 20% and were controlled for by inclusion of authentic standards to within 5%. Where possible, sample chromatographic peak identities were confirmed by spiking with the relevant authentic standard. Final confirmation was made by comparison of the matching ratio (R) of the standard and the sample peaks. R represents the ratio between the dominant oxidation channel and the adjacent subdominant channels. A given compound is oxidized at a specific potential, and therefore any compound can be described by a retention time and a potential. In practice, compounds were oxidized on a dominant detector set near their oxidation potential and exhibited a smaller response on the prior and following detectors. The ratio exhibited between the dominant and adjacent detector responses was characteristic of a given compound, and variations from that ratio, when a standard was close in concentration to a sample compound, indicated a co-eluting contaminant.
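By way of illustration only, the peak confirmation logic described above may be sketched as follows. The sketch is not the CEAS software; the function names, the default tolerances (taken from the 1% retention time and 20% channel ratio figures stated above) and the numerical responses in the usage note are assumptions chosen for the example.

```python
def channel_ratios(responses):
    """Return (dominant channel index, prior/dominant ratio, following/dominant ratio)."""
    dominant = max(range(len(responses)), key=lambda i: responses[i])
    peak = responses[dominant]
    if peak <= 0:
        raise ValueError("no positive detector response")
    prior = responses[dominant - 1] / peak if dominant > 0 else 0.0
    following = responses[dominant + 1] / peak if dominant + 1 < len(responses) else 0.0
    return dominant, prior, following


def matches_standard(sample, standard, sample_rt, standard_rt,
                     rt_tol=0.01, ratio_tol=0.20):
    """Hypothetical check: same dominant channel, retention time within rt_tol
    (fractional), and both adjacent-channel ratios within ratio_tol (fractional)
    of the ratios shown by the authentic standard."""
    s_dom, s_prior, s_next = channel_ratios(sample)
    t_dom, t_prior, t_next = channel_ratios(standard)
    if s_dom != t_dom:
        return False
    if abs(sample_rt - standard_rt) > rt_tol * standard_rt:
        return False
    for s, t in ((s_prior, t_prior), (s_next, t_next)):
        # A ratio outside tolerance suggests a co-eluting contaminant.
        if abs(s - t) > ratio_tol * t:
            return False
    return True
```

For example, with an assumed standard response of `[0.0, 2.0, 20.0, 100.0, 30.0, 1.0]` at 42.1 min, a sample response of `[0.0, 1.0, 11.0, 50.0, 14.0, 0.5]` at 42.0 min would be accepted, whereas a sample whose prior-channel response is inflated to 25.0 would be flagged as a possible co-elution.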

Data from each detector analog signal were converted and combined with other detector data to construct a time-potential map, which was compared with standards and between samples. Analytical values were calculated for sample peaks based on matches under restrictions for retention time, detector channel ratios and, to a lesser degree, peak heights, according to priorities optimized by the analyst over sequential monitored analyses. Where compound identity is known, final results were calculated as ng per ml of sample based on standard responses.
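As a minimal sketch of how a result in ng per ml may be derived from a standard response, the following assumes a single-point external standard and a linear detector response; neither assumption is stated in the text above, and the function name is hypothetical.

```python
def ng_per_ml(sample_peak, standard_peak, standard_ng_per_ml):
    """Single-point external standard calculation (assumes linear response).

    sample_peak and standard_peak are dominant-channel peak responses in the
    same units; standard_ng_per_ml is the known standard concentration.
    """
    if standard_peak <= 0:
        raise ValueError("standard response must be positive")
    return sample_peak / standard_peak * standard_ng_per_ml
```

Under these assumptions, a sample peak half the size of the peak given by a 10 ng/ml standard would be reported as 5 ng/ml.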

To automate analysis, a compound table was generated from a pool of multiple samples in a cohort, with pool concentrations defined as 100. Subsequent sample analysis generates reported values as a percentage of pool values. This table was used to analyze (initially with manual oversight, then automatically) all other pools and a few samples within the study. The CEAS analytical software has a built-in "learning" capacity, which is inherently part of the "standards" definition function of the analysis. As the operator oversaw a few analyses, decisions were made about parameters such as referencing retention times to other compounds or what degree of variation from the channel ratios would be tolerated. Conflicts and ambiguity in analysis were monitored and resolved during this test phase of the analysis. Eventually, the pool standard table "learns" how to reliably find a majority of the potential analytes in the samples. Typically >400 compounds were resolved in plasma at the 20 nanoampere gain. Reported values were captured in a file suitable for downloading into a database.
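The pool-referenced reporting described above, in which each pool value is defined as 100, may be sketched as follows. The function name and the peak labels in the usage note are hypothetical, and the handling of compounds missing from a sample (they are simply omitted) is an assumption.

```python
def percent_of_pool(sample_peaks, pool_peaks):
    """Express each sample peak as a percentage of the pooled-sample peak.

    Both arguments map a compound label to a peak response. Compounds absent
    from the sample, or with a non-positive pool response, are omitted.
    """
    report = {}
    for compound, pool_value in pool_peaks.items():
        if compound in sample_peaks and pool_value > 0:
            report[compound] = 100.0 * sample_peaks[compound] / pool_value
    return report
```

For instance, with assumed pool responses `{"peak_071": 80.0, "peak_123": 40.0}` and sample responses `{"peak_071": 60.0, "peak_123": 50.0}`, the sample would be reported at 75% and 125% of pool for the two peaks.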

Example I

The use of complex HPLC separations, coupled with coulometric array detectors, enables simultaneous quantitation of >400 compounds from serum (Figure 2A). The combination of retention time (Figure 2B) and ratio of response across adjacent detectors (Figure 2C) in the array enables reproducible identification of a given peak in multiple runs and comparison of samples of interest such as sera from AL and DR rats. In all, ~70 biochemically identified compounds and 350+ currently unidentified compounds were reproducibly measured using these techniques. See Table I, Fig. 3.

HPLC separations coupled with coulometric array detection

Data were initially generated by CEAS/CoulArray systems in the form of a set of 16 chromatograms (one for each detector). Figure 2A shows approximately one-fifth of a total chromatogram, including ~70 independent, identifiable and quantifiable peaks, from a 6-month old male Fischer 344 rat. Sensor potentials ranged from T1 (-100 mV) to T16 (+940 mV). Results are shown at an intermediate gain (200 nA). The x-axis is retention time, the y-axis is the magnitude of the response, and the 16 parallel traces represent the 16 detectors of the array from 1-16 (bottom to top). Figure 2B shows a later section of the chromatogram from 3 AL rats (top three traces) and 3 DR rats (bottom three traces). For clarity, only data from channel (detector) 8 are shown (gain = 500 nA). Arrows indicate two metabolites that are decreased by DR. Figure 2C shows the region of the chromatogram from Figure 2A (compound 123, see Figure 4) from one AL (top) and one DR (bottom) animal (gain 15 µA). As in Figure 2A, the 16 parallel traces represent the 16 detectors of the array from 1-16 (bottom to top). Note that the ratio of response across the detectors is constant.

Application of this technology to the study of sera from AL and DR rats has revealed 34 compounds that differ between these groups (Figure 4). Of these 34 compounds, 6 are reproducibly altered in both 6 and 12 month rats, and at least five of these six are also altered in 18 month rats. The remaining 28 markers include some with apparent age-specificity and others whose validity is still under investigation. These markers, which were originally identified in 6-month old AL and DR rats, differ sufficiently between AL and DR groups to separate animals into the correct dietary group by both hierarchical cluster analysis and principal component analysis (Figures 5A and 5B).

To verify feasibility, the HPLC system described above was used to determine the relative levels of 217 metabolites from the sera of 6 month old male AL and DR Fischer 344 rats. Analysis revealed 22 metabolites that differed between AL and DR rats by t-test without consideration for the Bonferroni correction (see Figure 4). These 22 compounds (see Table II, Figure 6) became the primary variables of interest in a follow-up study (N=8/group, 12 month AL and DR Fischer 344 rats). Analysis of these data confirmed statistical significance of 6 of these 22 compounds (marked by asterisks in Figure 4). Furthermore, five of these six also statistically differ between 18 month old AL and DR rats (p values <0.02, 0.002, 0.001, 0.0002, 0.0001); the sixth (metabolite #71, which was determined to be homovanillic acid) showed a similar trend, but p>0.05 (p<0.1, suggesting that increasing "N" likely will yield statistical significance). The remaining 16 compounds, as well as 12 compounds that were statistically significant only in the 12 month samples, likely include some that are type I statistical errors, some that may be statistically significant when "N" is increased (P currently <0.8 for many, some of which approach statistical significance in the second age group), and some that may only reflect the DR phenotype at specific ages. Further experiments using the methods described can be used to distinguish between these possibilities, and also to identify other markers of interest. Also, another compound was found to decrease >99% following short term caloric restriction.

As will be seen from the foregoing Example, alteration of the dietary paradigm on which animals are maintained can be used to develop specialized patterns or profiles. As examples, tests of male and female rats of different ages enable identification of age- and sex-dependent and -independent profiles associated with DR. Specific changes in the duration and extent of DR feeding regimens enable generation of an extended metabolic database relating markers to long- and short-term caloric intake and balance.

Similarly, the resulting data can be analyzed using univariate statistics (e.g., t-tests), multivariate statistics (e.g., ANOVA) or other multivariate analysis (hierarchical cluster analysis, principal component analysis) or through the use of pattern recognition algorithms to qualitatively and quantitatively identify metabolic profiles and relationships.

Serum Markers for DR

Referring to Figure 4, sera samples from male Fischer 344 rats were run on an ESA Model CEAS as described above. Sera from 6-month old and 12-month old AL and DR rats were analyzed (N=8/group). Data were expressed as the percentage of the level of analyte present in the sera of one of the 6-month old AL rats. Bars to the left of the vertical line represent compounds that differ statistically between 6 month old AL and DR rats; those bars to the right represent compounds that differ statistically between 12 month old AL and DR rats. Asterisks mark the 6 compounds that differ statistically in both groups (bars show only 6 month data; p values below are the values at 6 months). Out of 217 analytes quantified to date, 34 show p values <0.05 prior to Bonferroni corrections (uncorrected p values, in order: {left of line} p < 0.0008, 0.0008, 0.001, 0.001, 0.005, 0.0073, 0.0089, 0.0091, 0.012, 0.012, 0.013, 0.014, 0.017, 0.017, 0.017, 0.019, 0.023, 0.026, 0.026, 0.037, 0.04, 0.05; {right of line} p < 0.0017, 0.0027, 0.003, 0.0075, 0.011, 0.014, 0.014, 0.016, 0.023, 0.034, 0.035, 0.04).
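To illustrate what the Bonferroni correction referred to above would entail, the following worked calculation applies a conventional family-wise alpha of 0.05 (an assumption) across the 217 analytes, using the uncorrected p value bounds listed above. Because those values are reported as upper bounds ("p <"), the true p values may be smaller, so the count of survivors shown is a conservative illustration rather than a result of the study.

```python
# Uncorrected p value bounds as listed for Figure 4.
left_of_line = [0.0008, 0.0008, 0.001, 0.001, 0.005, 0.0073, 0.0089, 0.0091,
                0.012, 0.012, 0.013, 0.014, 0.017, 0.017, 0.017, 0.019,
                0.023, 0.026, 0.026, 0.037, 0.04, 0.05]
right_of_line = [0.0017, 0.0027, 0.003, 0.0075, 0.011, 0.014, 0.014, 0.016,
                 0.023, 0.034, 0.035, 0.04]

n_analytes = 217   # analytes quantified, per the text
alpha = 0.05       # assumed family-wise significance level

# Bonferroni: each individual test must reach alpha / number of tests.
bonferroni_threshold = alpha / n_analytes

# Number of analytes expected below 0.05 by chance alone in one comparison,
# if none of the 217 truly differed between groups.
expected_chance_hits = alpha * n_analytes

survivors = [p for p in left_of_line + right_of_line if p < bonferroni_threshold]

print(f"threshold = {bonferroni_threshold:.6f}")
print(f"expected chance hits per comparison = {expected_chance_hits:.2f}")
print(f"listed bounds below threshold = {len(survivors)}")
```

On these assumptions the per-test threshold is roughly 0.00023, and about 11 analytes per comparison would be expected to fall below 0.05 by chance, which is why replication in independent cohorts, as described above, is used to separate reproducible markers from type I errors.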

Observations:

The data in Figures 2 and 4 show that it is possible to identify metabolic differences in known groups; Figure 5 shows the reciprocal, namely that the metabolic profiles generated by coulometric array technology include sufficient information to identify the group to which a sample belongs. Thus, metabolic profiles reflective of long term DR may be used to group human samples, and the groups generated may in turn reflect the samples' identity (e.g., women who later developed breast cancer vs women who remained cancer free, and persons at high risk for development of disease vs persons at low risk for development of disease).

There are five components linking the methodology of the present invention to its utility. The first is the ability to identify an animal system in which disease frequency is reproducibly reduced. This is accomplished by using the dietary restricted rats, which have robustly increased longevity and decreased morbidity as compared with their ad libitum fed counterparts. The second is a methodology that enables us to capture serum components that differ between ad libitum and dietary restricted rats. Direct evidence for the utility of our invention to complete this component is shown in Figures 2B, 2C, 4 and 5. The third is based on the observation that the metabolites identified are sufficient to group animals by caloric intake. This is shown in Figure 5. The fourth component is based on the observation that at least some of the markers (metabolites) identified in non-human species can be identified in humans. This is true because of the overall similarity between the metabolism of all mammals. Direct confirmation has been previously demonstrated by Milbury et al. in their comparative studies of the bear and humans {46}. Finally, the fifth component is the ability of these markers, or subsets of them, to predict disease risk or diagnose disease in humans. This follows from the general similarity of metabolism between mammals, the strong association of many human diseases with caloric intake (e.g., some cancers, type II diabetes, cardiovascular and cerebrovascular diseases), and the established efficacy of DR against most forms of morbidity. Furthermore, the method for determining which subsets of markers have utility includes generation and verification of markers in animals coupled with testing these markers in human populations using methods developed for human epidemiology. Intermediate steps, such as testing multiple patterns in humans with defined nutritional intake, may be used to facilitate and strengthen the approach.

Figure 5 shows the grouping of the sera samples from 6 and 12 month old rats based on the metabolites that were identified as differing between 6-month old AL and DR rats. The dendrograms in Figure 5 (panels A and B) were generated using the hierarchical cluster analysis module of the Einsight data analysis package. Hierarchical cluster analysis is a method of data analysis that emphasizes the natural groupings of the data set. In contrast to analytical methods that emphasize distinguishing differences between two groups, hierarchical cluster analysis uses algorithms that reduce complex data sets to establish these groups without preconceived divisions. In these dendrograms, relative similarity within the total study population increases as one moves from right (0.0) to left (1.0, biochemical identity) on the horizontal axis. The smaller the distance is from identity (left side) to the point at which two samples (groups) are linked by a vertical line, the greater the relatedness of the two samples (groups). Conversely, the closer the split between two samples is to the right of the figure, the greater the disparity between the two samples or groups of samples.
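The general principle of hierarchical cluster analysis may be sketched as follows. This is not the Einsight implementation, whose distance metric and linkage rule are not stated here; the sketch assumes Euclidean distance with average linkage, reports merge distances rather than the 0.0 to 1.0 similarity scale of the dendrograms, and uses invented percent-of-pool profiles in the usage note.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length metabolite profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(profiles):
    """Agglomerative clustering of named profiles (dict: name -> list of values).

    Returns the merge history as (members_a, members_b, distance) tuples,
    from the most similar pair to the final join of all samples.
    """
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average of all pairwise distances between the two clusters.
                total = sum(euclidean(profiles[a], profiles[b])
                            for a in clusters[i] for b in clusters[j])
                dist = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        dist, i, j = best
        merges.append((clusters[i], clusters[j], dist))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges
```

With hypothetical three-marker profiles in which the AL samples lie near (100, 100, 100) and the DR samples near (60, 140, 55), the samples within each dietary group join first at small distances, and the two groups are linked only at the final, largest-distance merge, which corresponds to the split nearest the right-hand side of the dendrogram.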

Additional analyses were also conducted using Eigenvector or principal component analysis (PCA), which determines those analytes that contribute most heavily to the separation of groups (panels C and D of Figure 5). In this type of analysis, the two principal components that were most significant at explaining the variation in the database are termed PC-1 and PC-2, respectively. Relative mathematical values were assigned to the two groups of analytes that best discriminate the data set (PC-1 and PC-2, exact values are arbitrary). A scattergram then was plotted using the PC-1 value for the X-axis and the PC-2 value for the Y-axis. In the context of the current invention, principal component (Eigenvector) analysis enabled us to identify which of the multiple compounds that may differ between AL and DR animals were the most useful for classification purposes. This analysis also gives a means of estimating the consequences of removing different analytes from the profiles. This type of analysis permits us readily to determine which analytes contribute the most to our ability to distinguish members of one group from members of another (e.g., humans at high risk for developing a specific disease vs humans not at high risk for developing that disease).
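A minimal sketch of how a first principal component and its per-analyte loadings may be obtained is given below. It assumes mean-centered, unscaled data and uses power iteration on the covariance matrix, which is one of several ways to compute the leading eigenvector and is not necessarily the method used by the analysis package; the function name and the example values are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) for PC-1 of a samples-by-analytes table.

    loadings: unit-length eigenvector; larger absolute entries mark analytes
    that contribute more to PC-1. scores: projection of each sample on PC-1.
    The overall sign of both is arbitrary.
    """
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(m)]
           for a in range(m)]
    # Asymmetric start vector, to avoid starting orthogonal to PC-1.
    vec = [float(j + 1) for j in range(m)]
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break
        vec = [x / norm for x in nxt]
    scores = [sum(c[j] * vec[j] for j in range(m)) for c in centered]
    return vec, scores
```

Applied to hypothetical profiles in which three AL samples lie near (100, 100, 100) and three DR samples near (60, 140, 55), the PC-1 scores of the two dietary groups fall on opposite sides of zero, and the loadings indicate how strongly each of the three markers drives that separation. Recomputing the scores with one analyte removed gives a simple way of estimating the consequence of dropping that analyte from the profile.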

13 As shown in Figure 5, data of sufficient power can be generated such that both

14 hierarchical cluster analysis and principal component analysis were able to separate

15 the rat sera by dietary group in both the initial cohort of 6 month old rats (with 100%

16 accuracy, Figures 5A and 5C) and two independent cohorts of 12 and 18 month old rats

17 (with >85% accuracy, Figures 5B and 5D). The initial group confirms a series of

18 markers that, by themselves, retain a sufficient fraction of the information present in

19 sera to enable one to correctly identify the origin of the samples. More importantly,

20 the studies in the two independent data sets reveal that the data are able to identify a

21 series of markers with sufficient power to correctly identify >85% of unknown,

22 independent samples. Equally successful separation was achieved at all three ages 

23 regardless of whether all 22 markers were used or just the 6 markers that differed in 

24 both 6 and 12 month samples. Misclassifications were limited to a small subset [2-4 

25 rats] of the cohort, and were dependent on the markers used (6 or 22) and the exact 

26 algorithms used to conduct the analysis. 
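
The idea of testing a marker set on an independent cohort, and of comparing a full marker set against a reduced subset, can be illustrated with a simple sketch. Note that this uses a nearest-centroid classifier, which is a swapped-in technique chosen for brevity and is not the cluster or principal component analysis described in the text. All values, the four-marker layout, and the choice of subset are hypothetical.

```python
import math

def centroid(rows):
    """Mean profile of a group of samples."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def classify(profile, centroids, markers):
    """Assign a profile to the group whose centroid is nearest over the chosen markers."""
    def dist(c):
        return math.sqrt(sum((profile[m] - c[m]) ** 2 for m in markers))
    return min(centroids, key=lambda group: dist(centroids[group]))

def accuracy(test_set, centroids, markers):
    """Fraction of independent samples assigned to their true dietary group."""
    hits = sum(classify(p, centroids, markers) == label for label, p in test_set)
    return hits / len(test_set)

# Hypothetical training cohort (4 markers per sample); assumption, not study data.
training = {
    "AL": [[10.0, 2.0, 5.0, 1.0], [10.5, 2.2, 5.1, 3.0], [9.5, 1.8, 4.9, 2.0]],
    "DR": [[4.0, 6.0, 5.0, 3.0], [4.5, 6.3, 5.2, 1.0], [3.5, 5.7, 4.8, 2.0]],
}
centroids = {group: centroid(rows) for group, rows in training.items()}

# Hypothetical independent test cohort with known labels.
test_set = [
    ("AL", [9.0, 2.5, 5.0, 2.5]), ("AL", [8.5, 3.0, 5.3, 1.5]),
    ("DR", [5.0, 5.5, 4.9, 2.0]), ("DR", [5.5, 5.0, 5.1, 3.0]),
]

all_markers = [0, 1, 2, 3]   # stands in for the full marker set
subset = [0, 1]              # stands in for the reduced, replicable subset

print("all markers:", accuracy(test_set, centroids, all_markers))
print("subset:     ", accuracy(test_set, centroids, subset))
```

Comparing the two accuracies shows whether a reduced marker set retains enough information to classify unknown samples, which is the comparison the text reports for the 22 and 6 marker sets.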

27 Serum Markers Distinguish AL and DR Rats 

28 The 22 serum metabolites identified as potential markers in 6 month old AL 

29 and DR rats (Figure 4, left of vertical line) and the 6 markers shown to be replicable 

30 in 6 and 12 month old rats (Figure 4, asterisks) were used to determine groupings of 3 

31 sets of AL and DR rats (6, 12, and 18 months; 18 month data not shown). Rat

1 designations (e.g., A1) are consistent within age groups (vertically; e.g., A1 in Figs.

2 5A and 5C is the same rat, but A1 in Figs. 5A and 5B is not). Both hierarchical

3 cluster analysis (A,B) and principal component (Eigenvector) analysis (C,D) of the 

4 data are shown. (A) Dendrogram of analysis of the sera from 14 rats aged 6 months.

5 All 22 compounds were used to determine the natural groupings, but similar results 

6 were also obtained using only the 6 replicable markers. (B) Dendrogram of analysis 

7 of the sera from 15 rats aged 12 months (independent test set). All 22 compounds were

8 used to determine the natural groupings. Similar results were also obtained using only 

9 the 6 replicable markers and in samples from 18 month old rats. (C) Principal 

10 component analysis of sera from the 14 rats aged 6 months using all 22 markers. Similar

11 results were also obtained using only the 6 replicable markers. (D) Principal

12 component analysis of the sera from the 15 rats aged 12 months in the independent test

13 set using the 6 replicable markers. Similar results were also obtained using all 22

14 markers as well as in samples from 18 month old rats. All analysis was based on first

15 pass data, meaning that the HPLC data analysis software required no further training

16 and no human intervention to collect data of sufficient quality to distinguish AL and

17 DR rats.

18 The data presented in Figures 2, 4 and 5 demonstrate that the present invention

19 permits identification of markers that reproducibly differ between AL and DR rats, and

20 that metabolite profiles based on these markers are sufficiently powerful to assign serum

21 samples to the correct dietary groups by hierarchical cluster analysis and principal

22 component analysis with >85% accuracy even when these phenotypes may be 

23 partially obscured by age-related and/or individual variation. Increasing the "N" will 

24 readily increase the accuracy and power of these results by generating larger, and thus 

25 more informative, training sets, and by increasing the signal-to-noise ratio by 

26 removing noninformative metabolites from the profiles. Furthermore, building 

27 extended databases using rats maintained on specifically modified feeding regimens 

28 will enable one to parse out metabolites and metabolic profiles to increase power (e.g., 

29 one can identify markers that reflect a short term diet and distinguish those which 

30 reflect a truly long term reduced caloric intake). Both of these sets of markers may 

31 have utility for different uses. Finally, the data obtained can be analyzed by 



WO 99/50437 



PCT/US99/06762 



1 univariate, multivariate, or pattern recognition based analyses, and these analyses

2 may detect utility not seen with other analyses. 

3 It thus appears that HPLC with coulometric-array detectors advantageously 

4 may be employed to identify specific chemical markers, i.e. metabolites, sets of 

5 metabolites, and/or metabolic profiles (detected in sera or other biological samples) 

6 that separate AL from DR rats or other animals, and that such metabolites, sets of

7 metabolites, or metabolic profiles in turn may be used to diagnose or predict disease 

8 states or future risks of diseases. Such diseases may include degenerative diseases 

9 such as diabetes, in particular, type II diabetes, cardiovascular disease, stroke, heart 

10 attack, cerebrovascular disease, and other diseases whose etiology has been

11 established or hypothesized (e.g., Alzheimer's {1}) to be modified by diet or

12 nutrition, although utility in other diseases is also considered, including neoplastic

13 and non-neoplastic diseases, such as breast cancer, colon cancer, pancreatic cancer,

14 lymphoma, prostate cancer and leukemia, neurological diseases, neurodegenerative

15 diseases, autoimmune diseases, endocrine diseases, renal disease, Huntington's

16 disease, Parkinson's disease, Lou Gehrig's disease, and the like, as well as sensitivity

17 to toxins, e.g. industrial and/or environmental toxins. Moreover, applying the

18 technique of the present invention to a larger number of samples will permit one to

19 observe a greater number of chemical pattern characteristics, and to identify new

20 chemical patterns and/or new markers specific to particular diseases and/or sub- 

21 clinical conditions that in the future may develop into a specific disease. In turn, this 

22 may permit early intervention and thus possibly head off the development of the 

23 disease. The invention also advantageously may be employed for diagnosing other 

24 disease conditions, or sub-clinical conditions, i.e. before observable physical 

25 manifestations, that in the future may develop into disease conditions. Similarly, in 

26 addition to disease, the assessment of nutritive status may be useful as a medical test 

27 under a variety of potential clinical settings, or in controlling epidemiological or 

28 pharmaceutical testing, although other utilities are contemplated.

1 CLAIMS

2 1. In a method for diagnosing and/or predicting disorders in which

3 biological samples are analyzed to generate frequency distribution patterns 

4 representative of molecular constituents of said samples, the improvement which 

5 comprises comparing frequency distribution patterns of constituents of samples of ad 

6 libitum-fed and dietary-restricted individuals. 

7 2. A method according to claim 1, wherein said samples comprise body

8 fluids. 

9 3. A method according to claim 2, wherein said body fluids are selected 

10 from the group consisting of serum, plasma, platelets, saliva and urine.

11 4. A method according to claim 1, wherein said disorder is selected from 

12 the group consisting of neoplastic or non-neoplastic disease, cardiovascular or 

13 cerebrovascular disease, renal disease, autoimmune disease, neurological or

14 neurodegenerative disease, endocrine disease, and diabetes.

15 5. A method according to claim 1, wherein said disorder is selected from

16 the group consisting of breast cancer, colon cancer, pancreatic cancer, lymphoma,

17 prostate cancer and leukemia.

18 6. A method according to claim 1, wherein said disorder comprises

19 glomerulonephritis.

20 7. A method according to claim 1, wherein said disorder comprises

21 periarteritis.

22 8. A method according to claim 1, wherein said disorder is selected from 

23 the group consisting of myocardial degeneration, heart disease and stroke. 

24 9. A method according to claim 1, wherein said disorder comprises

25 atherosclerosis.

26 10. A method according to claim 1, wherein said disorder comprises

27 pituitary adenoma.

28 11. A method according to claim 1, wherein said disorder comprises type II

29 diabetes.

30 12. A method according to claim 1, wherein said disorder comprises

31 sensitivity to toxins.

1 13. A method according to claim 1, wherein said comparison is conducted

2 using univariate statistics.

3 14. A method according to claim 1, wherein said comparison is conducted

4 using multivariate statistics.

5 15. A method according to claim 1, wherein said comparison is conducted

6 using hierarchical cluster analysis.

7 16. A method according to claim 1, wherein said comparison is conducted

8 using principal component analysis.

9 17. A method according to claim 1, wherein said comparison is conducted

10 using pattern recognition algorithms to qualitatively and quantitatively identify

11 metabolic profiles and relationships.

12 18. A method according to claim 1, wherein said biological samples

13 comprise electrochemically active compounds, and including the steps of passing said

14 fluid samples sequentially through a liquid chromatographic column for achieving

15 time-space separation of the materials eluting from the column, and an

16 electrochemical detection apparatus whereby to generate electrochemical patterns of

17 said electrochemically active compounds.

18 19. A method according to claim 18, including the step of separating said

19 electrochemically active compounds by electrochemical characteristics in said

20 electrochemical detection apparatus.